forked from llvm/llvm-project
[pull] main from llvm:main #5651
Open: pull wants to merge 863 commits into Ericsson:main from llvm:main
+140,245
−51,053
…__support/math folder. (#162019) Part of #147386 in preparation for: https://discourse.llvm.org/t/rfc-make-clang-builtin-math-functions-constexpr-with-llvm-libc-to-support-c-23-constexpr-math-functions/86450
…n-overaligned-type` (#166546)
…ster (#164479) Technically, it is possible that a callee-saved register is saved in different locations. CFIInstrInserter should handle this, but currently it does not.
Use cannotHoistOrSinkRecipe to forbid sinking allocas.
When ISL encounters an internal error, it sets the error flag, but the error is not the isl_error_quota that was already checked. Check for general errors and abort the schedule optimization if that happens, instead of continuing on the good path. The error occurred when compiling llvm-test-suite's MultiSource/Applications/JM/lencod/leaky_bucket.c with Polly enabled. Not adding a test case because it depends on ISL internals. We do not want a test case to depend on which version of ISL is used.
Currently, the ARM backend incorrectly parses every `arm` prefixed arch to be non-thumb, but `armv6m` is THUMB and doesn't have ARM ops, causing the test to fail when compiling to assembly and not LLVM IR: `error: Function 'foo' uses ARM instructions, but the target does not support ARM mode execution.` This only happens when invoking cc1 directly and not the Clang driver. As a quick triage, this patch changes the tests to use `thumb`. Uncovered by #151404
…160536) As discussed in #153402, we have inefficiencies in handling constant pool access that are difficult to address. Using an IR pass to promote double constants to a global allows a higher degree of control of code generation for these accesses, resulting in improved performance on benchmarks that might otherwise have high register pressure due to accessing constant pool values separately rather than via a common base.
Directly promoting double constants to separate global values and relying on the global merger to do a sensible thing would be one potential avenue to explore, but it is _not_ done in this version of the patch because:
* The global merger pass needs fixes. For instance it claims to be a function pass, yet all of the work is done in initialisation. This means that attempts by backends to schedule it after a given module pass don't actually work as expected.
* The heuristics used can impact codegen unexpectedly, so I worry that tweaking it to get the behaviour desired for promoted constants may lead to other issues. This may be completely tractable though.
Now that #159352 has landed, the impact in terms of dynamically executed instructions is slightly smaller (as we are starting from a better baseline), but still worthwhile in lbm and nab from SPEC. Results below are for rva22u64:
```
Benchmark               Baseline        This PR  Diff (%)
============================================================
500.perlbench_r     180668945687   180666122417    -0.00%
502.gcc_r           221274522161   221277565086     0.00%
505.mcf_r           134656204033   134656204066     0.00%
508.namd_r          217646645332   216699783858    -0.44%
510.parest_r        291731988950   291916190776     0.06%
511.povray_r         30983594866    31107718817     0.40%
519.lbm_r            91217999812    87405361395    -4.18%
520.omnetpp_r       137699867177   137674535853    -0.02%
523.xalancbmk_r     284730719514   284734023366     0.00%
525.x264_r          379107521547   379100250568    -0.00%
526.blender_r       659391437610   659447919505     0.01%
531.deepsjeng_r     350038121654   350038121656     0.00%
538.imagick_r       238568674979   238560772162    -0.00%
541.leela_r         405660852855   405654701346    -0.00%
544.nab_r           398215801848   391352111262    -1.72%
557.xz_r            129832192047   129832192055     0.00%
```
---
Notes for reviewers:
* As discussed at the sync-up meeting, the suggestion is to try to land an incremental improvement to the status quo even if there is more work to be done around the general issue of constant pool handling. We can discuss here if that is actually the best next step or not, but I just wanted to clarify that's why this is being posted with a somewhat narrow scope.
* I've disabled transformations both for RV32 and on systems without D as both cases saw some regressions.
Fixes #165346 This patch renames stale variable names where `TypeSourceInfo` objects were still using the old `DI` (`DeclaratorInfo`) naming convention. Specifically, variables of type `TypeSourceInfo` have been updated from `DI` to `TSI` to improve code clarity and maintain consistency with the current naming.
The debug info attached to the BUNDLE is the first instruction in the BUNDLE, even if a better debug info (line:column) is present in the later instructions of the bundle. The patch tries to get a better debug info first. If not, then a worse debug info without line number is chosen. --------- Co-authored-by: Vladislav Dzhidzhoev <[email protected]> Co-authored-by: Orlando Cazalet-Hyams <[email protected]>
Move to the libcall impl based functions.
Update tests to contain auto generated checks.
PR #165993 accidentally broke the lowering of the `test.wait` Op. This patch fixes the issue and adds tests to verify the lowering to intrinsics for all mbarrier Ops, ensuring similar regressions are caught in the future. Additionally, the `cp-async-mbarrier` test is moved to the `mbarriers.mlir` test file to keep all related tests together. Signed-off-by: Durgadoss R <[email protected]>
#166536) …e size is unknown Keep _negative suffix only for test cases when the size is negative
… checks (#148810) This PR adds support for the NOTIFY specifier in the image selector as described in the 2023 standard, and adds checks for the NOTIFY_TYPE type.
…ional (#166032) This picks up from #166028, making the `Function` argument optional: most cases don't need to provide it, but in e.g. InstCombine's case, where the instruction (select, branch) is not attached to a function yet, the function needs to be passed explicitly. Co-authored-by: Florian Hahn <[email protected]>
…166078) In the following example, `Functor::method()` inappropriately triggers a diagnostic that `outer()` is blocking by allocating memory.
```
void outer() [[clang::nonblocking]]
{
  struct Functor {
    int* ptr;
    void method() { ptr = new int; }
  };
}
```
---------
Co-authored-by: Doug Wyatt <[email protected]>
…5630) When there's a deep inheritance hierarchy of multiple C++ classes (see below), the mangled name of a VFTable can include multiple key nodes in the target name. For example, in the following code, MSVC will generate mangled names for the VFTables that have up to three key classes in the context.
<details><summary>Code</summary>
```cpp
class Base1 {
  virtual void a() {};
};
class Base2 {
  virtual void b() {}
};
class Ind1 : public Base1 {};
class Ind2 : public Base1 {};
class A : public Ind1, public Ind2 {};
class Ind3 : public A {};
class Ind4 : public A {};
class B : public Ind3, public Ind4 {};
class Ind5 : public B {};
class Ind6 : public B {};
class C : public Ind5, public Ind6 {};
int main() { auto i = new C; }
```
</details>
This will include `??_7C@@6BInd1@@Ind4@@Ind5@@@` (and every other combination). Microsoft's undname will demangle this to "const C::\`vftable'{for \`Ind1's \`Ind4's \`Ind5'}". Previously, LLVM would demangle this to "const C::\`vftable'{for \`Ind1'}". With this PR, the output of LLVM's undname will be identical to Microsoft's version.
This changes `SpecialTableSymbolNode::TargetName` to a node array which contains each key from the name. Unlike namespaces, these keys are not in reverse order: they are in the same order as in the mangled name.
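The shape of the demangled string with several keys can be sketched as follows. This is only an illustration of the output format quoted above, not the demangler's code: `formatVFTable` and its parameters are hypothetical names, and the keys are assumed to arrive already in mangled-name order.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of LLVM): builds the vftable string in the
// format quoted in the description, given the class name and its key classes.
std::string formatVFTable(const std::string &className,
                          const std::vector<std::string> &keys) {
  std::string out = "const " + className + "::`vftable'{for ";
  for (std::size_t i = 0; i < keys.size(); ++i) {
    // Keys are emitted in the order given; every key except the last
    // is followed by "s " (as in `Ind1's `Ind4's `Ind5').
    if (i != 0)
      out += "s ";
    out += "`" + keys[i] + "'";
  }
  out += "}";
  return out;
}
```

With a single key this reduces to the string LLVM produced before the change; with three keys it yields the form attributed to Microsoft's undname.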
Fixes #145752 This PR inverts the result of `firstbithigh` when targeting DirectX by subtracting it from integer bitwidth - 1 to match the result from DXC. The result is not inverted if `firstbithigh` returned -1 or when targeting a backend other than DirectX.
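The inversion rule can be written out as a small C++ function. This is a sketch of the arithmetic described above, not the code from the PR: `invertFirstBitHigh`, `rawResult` and `bitWidth` are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical helper mirroring the rule in the description:
// rawResult is the value returned by the underlying operation,
// bitWidth is the width of the integer operand (e.g. 16, 32 or 64).
constexpr int32_t invertFirstBitHigh(int32_t rawResult, int32_t bitWidth) {
  // -1 means "no bit set" and is passed through unchanged.
  if (rawResult == -1)
    return -1;
  // Otherwise subtract the raw result from (bitwidth - 1).
  return (bitWidth - 1) - rawResult;
}
```

For a 32-bit operand, a raw result of 0 becomes 31 and a raw result of 31 becomes 0, while -1 stays -1.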
…s still used outside of the block only If the current node is copyable, its parent is copyable too, and the current node is used only outside the block, it is better to cancel scheduling for that node; otherwise a wrong def-use chain might be built during vectorization. Fixes #166775
The paper is ensuring that a static_assert operand can not be deferred until runtime; it must accept an integer constant expression which is resolved at compile time. Note, Clang extends what it considers to be a valid integer constant expression, so this also verifies the expected extension diagnostics.
Introduce a common interface for operations with alignment attributes across MemRef, Vector, and SPIRV dialects. The interface exposes getMaybeAlign() to retrieve alignment as llvm::MaybeAlign. This is the second part of the PRs addressing issue #155677. Co-authored-by: Erick Ochoa Lopez <[email protected]>
This patch introduces a new way to reconstruct the thread stackframe list. New `SyntheticFrameProvider` classes can lazily fetch a StackFrame at a given index using a provided StackFrameList. It can either be the real unwinder StackFrameList, or SyntheticFrameProviders can be chained to each other. This is the foundation work to implement ScriptedFrameProviders, which will come in a follow-up patch. Signed-off-by: Med Ismail Bennani <[email protected]> Signed-off-by: Med Ismail Bennani <[email protected]>
…6676) There may be valid reasons for not being able to find an SDK. Right now, it's printed as an error, which is causing confusion for users that interpret the error as something fatal, and not something that can be ignored. rdar://155346799
This teases the SFINAE handling bits out of the CodeSynthesisContext, and moves that functionality into SFINAETrap and a new class. There is also a small performance benefit here: <img width="1460" height="20" alt="image" src="https://github.com/user-attachments/assets/aeb446e3-04c3-418e-83de-c80904c83574" />
…#166826) This eliminates some redundant code.
…166662) This patch implements the base and python interface for the ScriptedFrameProvider class. This is necessary to call python APIs from the ScriptedFrameProvider that will come in a follow-up. Signed-off-by: Med Ismail Bennani <[email protected]> Signed-off-by: Med Ismail Bennani <[email protected]>
This patch enhances HexagonQFPOptimizer in multiple ways:
1. Refactor the code for better readability and maintainability.
2. Optimize vabs, vneg and vilog2 converts
The three instructions mentioned can be optimized like below:
```
v1.sf = v0.qf32
v2.qf = vneg v1.sf
```
to
```
v2.qf = vneg v0.qf32
```
This optimization eliminates one conversion and is applicable
to both qf32 and qf16 types.
3. Enable vsub fusion with mixed arguments
Previously, QFPOptimizer did not fuse partial qfloat operands with vsub.
This update allows selective use of vsub_hf_mix, vsub_sf_mix,
vsub_qf16_mix, and vsub_qf32_mix when appropriate. It also enables QFP
simplifications involving vector pair subregisters.
Example scenario in a machine basic block targeting Hexagon:
```
v1.qf32 = ... // result of a vadd
v2.sf = v1.qf32
v3.qf32 = vmpy(v2.sf, v2.sf)
```
4. Remove redundant conversions
Under certain conditions, we previously bailed out before removing
qf-to-sf/hf conversions. This patch removes that bailout, enabling more
aggressive elimination of unnecessary conversions.
5. Don't optimize equals feeding into multiply
Removing converts feeding into multiply loses precision. This patch
avoids optimizing multiplies by default and gives users a flag to
enable the optimization.
Patch By: Fateme Hosseini
Co-authored-by: Kaushik Kulkarni <[email protected]>
Co-authored-by: Santanu Das <[email protected]>
This is canonical in the rest of the repository and otherwise we can end up with warnings when compiling with clang-cl on Windows that look like the following:
```
2025-11-06T17:55:25.2412502Z C:\_work\llvm-project\llvm-project\llvm\include\llvm/Support/thread.h(37,5): warning: 'LLVM_ON_UNIX' is not defined, evaluates to 0 [-Wundef]
2025-11-06T17:55:25.2413436Z    37 | #if LLVM_ON_UNIX || _WIN32
2025-11-06T17:55:25.2413791Z       |     ^
2025-11-06T17:55:25.2414625Z C:\_work\llvm-project\llvm-project\llvm\include\llvm/Support/thread.h(52,5): warning: 'LLVM_ON_UNIX' is not defined, evaluates to 0 [-Wundef]
2025-11-06T17:55:25.2415585Z    52 | #if LLVM_ON_UNIX
2025-11-06T17:55:25.2415901Z       |     ^
2025-11-06T17:55:25.2416169Z 2 warnings generated.
```
Reviewers: joker-eph, pcc, cachemeifyoucan
Reviewed By: cachemeifyoucan
Pull Request: #166827
We recently moved over to compiling with clang-cl on Windows. This ended up causing a large increase in warnings, particularly due to how warnings are handled in nanobind. cd91d0f initially set -Wall -Wextra and -Wpedantic while fixing another issue, which is probably not what we want to do on third-party code. We also need to disable -Wmissing-field-initializers to get things clean in this configuration. Reviewers: makslevental, jpienaar, rkayaith Reviewed By: makslevental Pull Request: #166828
We removed the limit a while back after moving to new infrastructure but never removed the comment. Do that now to prevent confusion.
…166005) This fixes two problems:
- dyld itself resides within the shared cache. MemoryMappingLayout incorrectly computes the slide for dyld's segments, causing them to (appear to) overlap with other modules. This can cause symbolication issues.
- The MemoryMappingLayout ranges on Darwin are not disjoint due to the fact that the LINKEDIT segments overlap for each module. We now ignore these segments to ensure the mapping is disjoint.
This adds a check for disjointness, and a runtime warning if this is ever violated (as that suggests issues in the sanitizer memory mapping). There is now a test to ensure that these problems do not recur. rdar://163149325
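A disjointness check over a set of address ranges can be sketched as below. This is a generic illustration under stated assumptions, not the sanitizer's implementation: `Range` and `rangesAreDisjoint` are hypothetical names, and each range is assumed to be half-open, `[start, end)`.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one mapped segment: the half-open range [start, end).
struct Range {
  uint64_t start;
  uint64_t end;
};

// Returns true if no two ranges overlap. The vector is taken by value so the
// caller's ordering is left untouched.
bool rangesAreDisjoint(std::vector<Range> ranges) {
  std::sort(ranges.begin(), ranges.end(),
            [](const Range &a, const Range &b) { return a.start < b.start; });
  for (std::size_t i = 1; i < ranges.size(); ++i) {
    // Once sorted by start address, any overlap shows up between neighbours:
    // a range that begins before its predecessor ends.
    if (ranges[i].start < ranges[i - 1].end)
      return false;
  }
  return true;
}
```

Sorting first keeps the check at O(n log n) instead of comparing every pair of ranges; a caller could emit a warning when the function returns false.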
…_v (#160607) Implemented [[*time.traits.is.clock*]](https://eel.is/c++draft/time.traits.is.clock) from [P0355R7](https://wg21.link/p0355r7). This patch implements the C++20 feature `is_clock` and `is_clock_v` based on the documentation [on cppreference](https://en.cppreference.com/w/cpp/chrono/is_clock.html) Fixes #166049.
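For readers unfamiliar with the trait, the minimum conditions that [time.traits.is.clock] lists can be approximated with a C++17 detection idiom. This is a hand-written sketch, not the libc++ implementation from this patch: `is_clock_sketch` and `is_clock_sketch_v` are hypothetical names, and real code should use `std::chrono::is_clock_v` where it is available.

```cpp
#include <chrono>
#include <type_traits>

// Hypothetical re-implementation for illustration only.
// Primary template: types that fail any of the checks are not clocks.
template <class T, class = void>
struct is_clock_sketch : std::false_type {};

// Specialization selected only when the nested types rep, period, duration
// and time_point exist and the expressions T::is_steady and T::now() are
// well-formed as unevaluated operands.
template <class T>
struct is_clock_sketch<
    T, std::void_t<typename T::rep, typename T::period, typename T::duration,
                   typename T::time_point, decltype(T::is_steady),
                   decltype(T::now())>> : std::true_type {};

template <class T>
inline constexpr bool is_clock_sketch_v = is_clock_sketch<T>::value;
```

Under these assumptions `is_clock_sketch_v<std::chrono::system_clock>` is true and `is_clock_sketch_v<int>` is false.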
This PR adds the necessary infrastructure to enable testing of the ACCImplicitData pass for FIR/HLFIR, along with comprehensive test coverage for implicit data clause generation in OpenACC constructs.
New Infrastructure:
- Add FIROpenACCSupport analysis providing FIR-specific implementations of OpenACCSupport interface methods for variable name extraction, recipe name generation, and NYI emission
- Add FIROpenACCUtils with helper functions for:
  * Variable name extraction from FIR operations (getVariableName)
  * Recipe name generation with FIR type string representation
  * Bounds checking for constant array sections
- Add ACCInitializeFIRAnalyses pass to pre-register FIR analyses (OpenACCSupport and AliasAnalysis) for use by subsequent OpenACC passes in the pipeline
Refactoring in flang/lib/Lower/OpenACC.cpp:
- Move bounds string generation and bounds checking to FIROpenACCUtils
- Refactor recipe name generation to use fir::acc::getRecipeName
Test Coverage:
- acc-implicit-firstprivate.fir: Tests implicit firstprivate behavior for scalar types (i8, i16, i32, i64, f32, f64, logical, complex) in parallel/serial constructs with recipe generation verification
- acc-implicit-data.fir: Tests implicit data clauses for scalars, arrays, derived types, and boxes in kernels/parallel/serial with default(none) and default(present) variations
- acc-implicit-data-fortran.F90: Fortran tests verifying implicit data generation through bbc with both HLFIR and FIR
- acc-implicit-data-derived-type-member.F90: Tests correct ordering of parent/child data clause operations for derived type members
- acc-implicit-copy-reduction.fir: Tests enable-implicit-reduction-copy flag controlling whether reduction variables use copy or firstprivate
This enables proper testing of implicit data clause generation through the flang optimizer pipeline for OpenACC directives.
Main executables were bypassing the locate module callback that shared libraries use, preventing custom symbol file location logic from working consistently. This PR fixes this by:
* Adding target context to ModuleSpec
* Leveraging that context to use the target search path and the platform's locate module callback in ModuleList::GetSharedModule
This ensures both main executables and shared libraries get the same callback treatment for symbol file resolution.
---------
Co-authored-by: George Hu <[email protected]> Co-authored-by: George Hu <[email protected]>
A global offset table is a section that holds the address of functions that are dynamically linked. The Swift plugin needs to know if sections are a global offset table or not.
Looks like #166517 is breaking the libc-riscv32-qemu-yocto-fullbuild-dbg build due to a failing overflow test for strfrom. https://lab.llvm.org/buildbot/#/changes/58668
```
int result = func(buff, sizeof(buff), "%.2147483647f", 1.0f);
EXPECT_LT(result, 0);
ASSERT_ERRNO_FAILURE();
```
```
[ RUN ] LlvmLibcStrfromdTest.CharsWrittenOverflow
/home/libcrv32buildbot/bbroot/libc-riscv32-qemu-yocto-fullbuild-dbg/llvm-project/libc/test/src/stdlib/StrfromTest.h:493: FAILURE
Expected: result
Which is: 0
To be less than: 0
Which is: 0
/home/libcrv32buildbot/bbroot/libc-riscv32-qemu-yocto-fullbuild-dbg/llvm-project/libc/test/src/stdlib/StrfromTest.h:494: FAILURE
Expected: 0
Which is: 0
To be not equal to: static_cast<int>(libc_errno)
Which is: 0
[ FAILED ] LlvmLibcStrfromdTest.CharsWrittenOverflow
Ran 8 tests. PASS: 7 FAIL: 1
```
At first glance it seems like there is some kind of overflow in internal::strfromfloat_convert on 32-bit archs, because the other overflow test case is passing for snprintf. Interestingly, it passes on all other buildbots, including libc-arm32-qemu-debian-dbg. This issue likely wasn't introduced by #166517 and was probably already present, so I'm not reverting the change, just disabling the test case on riscv32 until I can debug properly.
…de braced initializers (#166180) Fixes #163498
---
This PR addresses the issue of confusing diagnostics for lambdas with init-captures appearing inside braced initializers. Cases such as:
```cpp
S s{[a(42), &] {}};
```
were misparsed as C99 array designators, producing unrelated diagnostics, such as `use of undeclared identifier 'a'`, and `expected ']'`
---
https://github.com/llvm/llvm-project/blob/bb9bd5f263226840194b28457ddf9861986db51f/clang/lib/Parse/ParseInit.cpp#L470
https://github.com/llvm/llvm-project/blob/bb9bd5f263226840194b28457ddf9861986db51f/clang/lib/Parse/ParseInit.cpp#L74
https://github.com/llvm/llvm-project/blob/bb9bd5f263226840194b28457ddf9861986db51f/clang/include/clang/Parse/Parser.h#L4652-L4655
https://github.com/llvm/llvm-project/blob/24c22b7de620669aed9da28de323309c44a58244/clang/lib/Parse/ParseExprCXX.cpp#L871-L879
The tentative parser now returns `Incomplete` for partially valid lambda introducers, preserving the `lambda` interpretation and allowing the proper diagnostic to be issued later.
---
Clang now correctly recognizes such constructs as malformed lambda introducers and emits the expected diagnostic (for example, “capture-default must be first”), consistent with direct initialization cases such as:
```cpp
S s([a(42), &] {});
```
The low hanging fruit that was causing the vast majority of these warnings has been fixed, so reenable them now. There are still a couple more warnings that could probably do with some cleanup, but those can be fixed in the future.
This allows SDNodes to be validated against their expected type profiles and reduces the number of changes required to add a new node. Fix BR_CC/MEMCPY descriptions to match C++ code that creates the nodes (an error detected by the enabled verification functionality). Also remove redundant `SDNPOutGlue` on `BPFISD::MEMCPY`. Part of #119709.
…162822) Check that all partial reductions in a chain are only used by other partial reductions with the same scale factor. Otherwise we end up creating users of scaled reductions where the types of the other operands don't match. A similar issue was addressed in #158603, but misses the chained cases. Fixes #162530. PR: #162822
…tions (#166776) Seeing warnings:
llvm/include/llvm/CodeGen/LibcallLoweringInfo.h:15:46: error: 'visibility' attribute ignored [-Werror=attributes]
   15 | LLVM_ABI const RTLIB::RuntimeLibcallsInfo &RTLCI;
llvm/include/llvm/CodeGen/LibcallLoweringInfo.h:18:25: error: 'visibility' attribute ignored [-Werror=attributes]
   18 | RTLIB::Unsupported};
Fix shared library linking failure for FIROpenACCTransforms
…#166674) Remove the unnecessary sleep in MachProcess::AttachForDebug. The preceding comment makes it seem like it's necessary for synchronization, though I don't believe that's the case (see below), and even if it were, sleeping is not a reliable way to achieve that. The reason I don't believe it's necessary is because after we return, we synchronize with the exception thread on a state change. The latter will call and update the process state, which is exactly what we synchronize on. I was able to verify that this is the first time we change the process state: i.e., `GetState` doesn't return a different value before and after the sleep. On top of that, there are 3 more places where we call ptrace attach (`PosixSpawnChildForPTraceDebugging`, `SBLaunchForDebug`, and `BoardServiceLaunchForDebug`) where we don't sleep. rdar://163952037
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.4)